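RNN-T models have gained popularity in both the literature and commercial systems because of their competitiveness and their ability to operate in online streaming mode. In this work, we conduct an extensive study comparing several prediction network architectures for both monotonic and original RNN-T models. We compare 4 types of prediction networks on top of a common state-of-the-art Conformer encoder and report results obtained on LibriSpeech and an internal medical conversation dataset. Our study covers both offline batch-mode and online streaming scenarios. In contrast to some previous works, our results show that the Transformer does not always outperform the LSTM when used as a prediction network together with a Conformer encoder. Inspired by these results, we propose a new, simple prediction network architecture, N-CONCAT, which outperforms the others in our online streaming benchmark. The Transformer and n-gram reduced architectures perform very similarly, yet show some important differences in behavior with respect to previous context. Overall, we obtain up to a 4.1% relative improvement over our LSTM baseline while reducing the prediction network parameters by nearly an order of magnitude (8.4 times).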
Artificial neural networks can learn complex, salient data features to achieve a given task. On the opposite end of the spectrum, mathematically grounded methods such as topological data analysis allow users to design analysis pipelines fully aware of data constraints and symmetries. We introduce a class of persistence-based neural network layers. Persistence-based layers allow the users to easily inject knowledge about symmetries (equivariance) respected by the data, are equipped with learnable weights, and can be composed with state-of-the-art neural architectures.
In the future, service robots are expected to operate autonomously for long periods of time without human intervention. Much work toward this goal has emerged alongside advances in robotics hardware and software. We believe that an important underpinning of long-term robot autonomy is the ability of robots to learn on site and on the fly, especially when they are deployed in changing environments or need to traverse different environments. In this paper, we examine the problem of long-term autonomy from the perspective of robot learning, especially online learning, and discuss in tandem its premise, "data", and the subsequent "deployment".
We conducted a systematic review of machine-learning strategies for improving the generalizability (cross-subject and cross-session) of electroencephalography (EEG)-based emotion classification. In this context, the non-stationarity of EEG signals is a critical issue and can lead to the Dataset Shift problem. Several architectures and methods have been proposed to address this issue, mainly based on transfer learning. A total of 418 papers were retrieved from the Scopus, IEEE Xplore and PubMed databases through a search query focusing on modern machine learning techniques for generalization in EEG-based emotion assessment. Among these papers, 75 were found eligible based on their relevance to the problem. Studies that lacked a specific cross-subject and cross-session validation strategy or that used other biosignals as support were excluded. Based on an analysis of the selected papers, we propose a taxonomy of the studies employing Machine Learning (ML) methods, together with a brief discussion of the different ML approaches involved. We identify the studies with the best results in terms of average classification accuracy, which suggest that transfer learning methods perform better than other approaches. We also discuss the impact of (i) the theoretical models of emotion and (ii) psychological screening of the experimental sample on classifier performance.
We introduce a new probabilistic temporal logic for the verification of Markov Decision Processes (MDP). Our logic is the first to include operators for causal reasoning, allowing us to express interventional and counterfactual queries. Given a path formula $\phi$, an interventional property is concerned with the satisfaction probability of $\phi$ if we apply a particular change $I$ to the MDP (e.g., switching to a different policy); a counterfactual allows us to compute, given an observed MDP path $\tau$, what the outcome of $\phi$ would have been had we applied $I$ in the past. For its ability to reason about different configurations of the MDP, our approach represents a departure from existing probabilistic temporal logics that can only reason about a fixed system configuration. From a syntactic viewpoint, we introduce a generalized counterfactual operator that subsumes both interventional and counterfactual probabilities as well as the traditional probabilistic operator found in e.g., PCTL. From a semantics viewpoint, our logic is interpreted over a structural causal model (SCM) translation of the MDP, which gives us a representation amenable to counterfactual reasoning. We provide a proof-of-concept evaluation of our logic on a reach-avoid task in a grid-world model.
Brain decoding is a field of computational neuroscience that uses measurable brain activity to infer mental states or internal representations of perceptual inputs. We propose a novel approach to brain decoding that also relies on semantic and contextual similarity. We employ an fMRI dataset of natural image vision and create a deep learning decoding pipeline inspired by the existence of both bottom-up and top-down processes in human vision. We train a linear brain-to-feature model to map fMRI activity features to visual stimuli features, assuming that the brain projects visual information onto a space that is homeomorphic to the latent space represented by the last convolutional layer of a pretrained convolutional neural network, which typically collects a variety of semantic features that summarize and highlight similarities and differences between concepts. These features are then categorized in the latent space using a nearest-neighbor strategy, and the results are used to condition a generative latent diffusion model to create novel images. From fMRI data only, we produce reconstructions of visual stimuli that match the original content very well on a semantic level, surpassing the state of the art in previous literature. We evaluate our work using a quantitative semantic metric (the Wu-Palmer similarity metric over the WordNet lexicon, which had an average value of 0.57) and a human evaluation experiment, which resulted in correct evaluation, according to the multiplicity of human criteria in evaluating image similarity, in over 80% of the test set.
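To make the semantic metric mentioned above concrete, the following is a minimal sketch of Wu-Palmer similarity, 2 · depth(LCS) / (depth(a) + depth(b)), where LCS is the deepest common ancestor of the two concepts. It runs on a tiny hand-made taxonomy rather than WordNet; the `PARENT` table, the concept names, and the convention that the root has depth 1 are all assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical toy is-a taxonomy (child -> parent); this is NOT WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def path_to_root(node):
    """Return the list [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (assumed convention)."""
    return len(path_to_root(node))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the deepest common ancestor.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    print(wu_palmer("dog", "cat"))  # siblings under "animal"
    print(wu_palmer("dog", "car"))  # only share the root
```

Under this toy taxonomy, concepts that share a deeper ancestor score closer to 1, which is the sense in which a reconstruction can be "semantically close" to the original stimulus even when the pixels differ.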
Autoregressive processes naturally arise in a large variety of real-world scenarios, including stock markets, sales forecasting, weather prediction, advertising, and pricing. When addressing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations must be properly accounted for in order to converge to the optimal decision policy. In this work, we propose a novel online learning setting, named Autoregressive Bandits (ARBs), in which the observed reward follows an autoregressive process of order $k$ whose parameters depend on the action the agent chooses from a finite set of $n$ actions. We then devise an optimistic regret minimization algorithm, AutoRegressive Upper Confidence Bounds (AR-UCB), that suffers regret of order $\widetilde{\mathcal{O}} \left( \frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)^2} \right)$, where $T$ is the optimization horizon and $\Gamma < 1$ is an index of the stability of the system. Finally, we present a numerical validation in several synthetic settings and one real-world setting, comparing against general-purpose and specific-purpose bandit baselines and showing the advantages of the proposed approach.
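The reward model described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the abstract does not give the exact form of the process, so the code assumes a linear AR($k$) reward $x_t = \gamma_0(a_t) + \sum_{i=1}^{k} \gamma_i(a_t)\, x_{t-i} + \text{noise}$ with Gaussian noise and zero initial history. The function name `simulate_ar_rewards`, the `params` layout, and the example coefficients are hypothetical, and the AR-UCB algorithm itself is not implemented here.

```python
import random


def simulate_ar_rewards(params, policy, horizon, noise_std=0.1, seed=0):
    """Simulate rewards from an action-dependent AR(k) process (assumed form).

    params[a] = [gamma_0, gamma_1, ..., gamma_k] for each action a.
    policy(t) returns the action played at round t.
    """
    rng = random.Random(seed)
    k = len(next(iter(params.values()))) - 1
    history = [0.0] * k  # most recent reward first; zero start is an assumption
    rewards = []
    for t in range(horizon):
        gamma = params[policy(t)]
        # Offset + weighted past rewards + Gaussian noise.
        x = gamma[0] + sum(gamma[i + 1] * history[i] for i in range(k))
        x += rng.gauss(0.0, noise_std)
        rewards.append(x)
        history = ([x] + history)[:k]
    return rewards


if __name__ == "__main__":
    # Two hypothetical actions with AR(1) dynamics; coefficients sum to < 1.
    params = {0: [1.0, 0.5], 1: [0.2, 0.8]}
    always_zero = simulate_ar_rewards(params, lambda t: 0, horizon=50)
    print(sum(always_zero) / len(always_zero))
```

Because each reward feeds into the next one, the value of an action depends on the rewards accumulated so far, which is why a learner that treats rounds as independent can be misled in this setting.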
The fifth generation of the Radio Access Network (RAN) has brought new services, technologies, and paradigms with the corresponding societal benefits. However, the energy consumption of 5G networks is today a concern. In recent years, the design of new methods for decreasing the RAN power consumption has attracted interest from both the research community and standardization bodies, and many energy-saving solutions have been proposed. However, there is still a need to understand the power consumption behavior of state-of-the-art base station architectures, such as multi-carrier active antenna units (AAUs), as well as the impact of different network parameters. In this paper, we present a power consumption model for 5G AAUs based on artificial neural networks. We demonstrate that this model achieves good estimation performance and is able to capture the benefits of energy saving when dealing with the complexity of multi-carrier base station architectures. Importantly, multiple experiments are carried out to show the advantage of designing a general model able to capture the power consumption behaviors of different types of AAUs. Finally, we provide an analysis of the model scalability and the training data requirements.
Autonomous driving has a natural bi-level structure. The goal of the upper behavioural layer is to provide appropriate lane-change, acceleration, and braking decisions to optimize a given driving task. However, this layer can only indirectly influence the driving efficiency through the lower-level trajectory planner, which takes in the behavioural inputs to produce motion commands. Existing sampling-based approaches do not fully exploit the strong coupling between the behavioural and planning layers. On the other hand, end-to-end Reinforcement Learning (RL) can learn a behavioural layer while incorporating feedback from the lower-level planner. However, purely data-driven approaches often fail on safety metrics in unseen environments. This paper presents a novel alternative: a parameterized bi-level optimization that jointly computes the optimal behavioural decisions and the resulting downstream trajectory. Our approach runs in real time using a custom GPU-accelerated batch optimizer and a warm-start strategy learned with a Conditional Variational Autoencoder. Extensive simulations show that our approach outperforms state-of-the-art model predictive control and RL approaches in terms of collision rate while being competitive in driving efficiency.
Just as vision plays a fundamental role in guiding adaptive locomotion in humans, computer vision may bring substantial improvements to the control strategy of a walking assistive technology by enabling environment-based assistance modulation. In this work, we developed a hip exosuit controller able to distinguish among three different walking terrains through the use of an RGB camera and to adapt the assistance accordingly. The system was tested with seven healthy participants walking along an overground path comprising staircases and level ground. Subjects performed the task with the exosuit disabled (Exo Off), with a constant assistance profile (Vision Off), and with assistance modulation (Vision On). Our results showed that the controller was able to classify the path in front of the user promptly and in real time, with a per-class accuracy above 85%, and to modulate assistance accordingly. Evaluation of the effects on the user showed that Vision On outperformed the other two conditions: we obtained significantly higher metabolic savings than Exo Off, with a peak of about -20% when climbing the staircase and about -16% over the whole path, and significantly higher savings than Vision Off when ascending or descending stairs. Such advancements may represent a step forward for the use of lightweight walking assistive technologies in real-life scenarios.